Character-Level Linguistic Features Extraction for Text-to-Speech System

Authors

  • Kuan-Hung Chen
  • Shu-Han Liao
  • Yuan-Fu Liao
  • Yih-Ru Wang
Abstract

High-quality linguistic features are the key to the success of speech synthesis. Traditional linguistic feature extraction methods usually rely on a word-level natural language processing (NLP) parser. Since a good parser requires a lot of feature engineering to build, it is usually a general-purpose one and often not specially designed for speech synthesis. To avoid these difficulties, we propose to replace the conventional NLP parser with a character embedding and a character-level ...

國立台北科技大學電子工程系 Department of Electronic Engineering, National Taipei University of Technology. E-mail: {s970428, sam8105111}@gmail.com; [email protected]

國立交通大學電機工程系 College of Electrical and Computer Engineering, National Chiao-Tung University. E-mail: [email protected]


Similar resources

A Pattern Extraction Workbench Combining Multiple Linguistic Levels

In this paper an interactive pattern extraction workbench, I*Pex, is presented. The workbench comes in a graphical environment and is designed to be used in an incremental and interactive fashion with the user. Patterns can be constructed to work in combination involving specifications on several linguistic levels simultaneously, from the character level using regular expressions, parts of spee...

Towards a new level of annotation detail of multilingual speech corpora

The aim of this paper is to highlight the actual need for corpora that have been annotated based on acoustic information. The acoustic information should be coded in features or properties and is needed to inform further processing systems, i.e. to present a basis for a speech recognition system using linguistic information. Feature annotation of existing corpora in combination with segmental a...

Identification of Scene Text by Character Descriptor in Smart Mobile Devices

Abstract— Text data present in images and video contain useful information for automatic annotation, indexing, and structuring of images. Extraction of this information involves detection, localization, tracking, extraction, enhancement, and recognition of the text from a given image. However, variations of text due to differences in size, style, orientation, and alignment, as well as low image...

Can characters reveal your native language? A language-independent approach to native language identification

A common approach in text mining tasks such as text categorization, authorship identification or plagiarism detection is to rely on features like words, part-of-speech tags, stems, or some other high-level linguistic features. In this work, an approach that uses character n-grams as features is proposed for the task of native language identification. Instead of doing standard feature selection,...

Journal:
  • IJCLCLP

Volume 21, Issue

Pages -

Publication date: 2016